Recognition-based vs syntax-directed models for numerical field extraction in handwritten documents

Authors

  • Clément Chatelain
  • Laurent Heutte
  • Thierry Paquet
Abstract

In this article, two different strategies are proposed for numerical field extraction in weakly constrained handwritten documents. The first extends classical handwriting recognition methods, while the second is inspired by approaches commonly used for information extraction from electronic documents. The models and the implementation of these two opposing strategies are described, and experimental results on a real handwritten mail database are presented.

Similar resources

Segmentation-Driven Recognition Applied to Numerical Field Extraction from Handwritten Incoming Mail Documents

In this paper, we present a method for the automatic extraction of numerical fields (zip codes, phone numbers, etc.) from incoming mail documents. The approach is based on a segmentation-driven recognition that aims at locating isolated and touching digits among the textual information. A syntactical analysis is then performed on each line of text in order to filter the sequences that respect a...

Neural Network Based Recognition System Integrating Feature Extraction and Classification for English Handwritten

Handwriting recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. It has numerous applications, including reading aids for the blind, bank cheque processing, and conversion of handwritten documents into structured text. Neural Network (NN) with its inherent learning ability offers promising solutions for handwritten characte...

Discrimination Between Digits and Outliers in Handwritten Documents Applied to the Extraction of Numerical Fields

In this article, we propose a system for extracting numerical fields from unconstrained handwritten documents. The system is based on a recognition-driven segmentation stage followed by a syntactical analysis that detects the sequences that may compose a numerical field. We focus here on the design of a digit classifier, embedded in the segmentation/recognition process, that is able to discriminate digits...

Numerical Field Extraction in Handwritten Incoming Mail Documents

In this communication, we propose a method for the automatic extraction of numerical fields in handwritten documents. The approach exploits the known syntactic structure of the numerical field to be extracted, combined with a set of contextual morphological features, to find the best label for each connected component. Applying an HMM-based syntactic analyzer on the overall document allows to localize...

Numerical Sequence Extraction in Handwritten Incoming Mail Documents

In this communication, we propose a method for the automatic extraction of numerical fields in handwritten documents. The approach exploits the known syntactic structure of the numerical field to be extracted, combined with a set of contextual morphological features, to find the best label for each connected component. Applying an HMM-based syntactic analyzer on the overall document allows to localize...

Publication date: 2008